Overview

Dataset statistics

Number of variables: 17
Number of observations: 10757
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.7 MiB
Average record size in memory: 168.6 B
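
As a quick consistency check on the memory figures, the average record size multiplied by the number of observations should reproduce the total size. The sketch below assumes that MiB means 2**20 bytes; the variable names are illustrative and not part of the report.

```python
# Figures copied from the dataset statistics.
n_observations = 10757
avg_record_bytes = 168.6

# Assumption: 1 MiB = 2**20 bytes.
total_bytes = n_observations * avg_record_bytes
total_mib = total_bytes / 2**20

print(f"{total_mib:.1f} MiB")
```

The result rounds to the reported total size in memory.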

Variable types

DateTime: 2
Numeric: 9
Categorical: 5
Unsupported: 1

Alerts

mta_tax has constant value "0.5" (Constant)
improvement_surcharge has constant value "0.3" (Constant)
RatecodeID is highly overall correlated with extra and 3 other fields (High correlation)
duration_minutes is highly overall correlated with fare_amount and 2 other fields (High correlation)
extra is highly overall correlated with RatecodeID and 3 other fields (High correlation)
fare_amount is highly overall correlated with RatecodeID and 4 other fields (High correlation)
tip_amount is highly overall correlated with total_amount (High correlation)
total_amount is highly overall correlated with RatecodeID and 5 other fields (High correlation)
trip_distance is highly overall correlated with RatecodeID and 4 other fields (High correlation)
RatecodeID is highly imbalanced (94.9%) (Imbalance)
payment_type is highly imbalanced (52.6%) (Imbalance)
duration_minutes is highly skewed (γ1 = 20.27967956) (Skewed)
trip_duration is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
tip_amount has 3654 (34.0%) zeros (Zeros)
tolls_amount has 10368 (96.4%) zeros (Zeros)
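
The two imbalance percentages are consistent with a normalised-entropy score, 1 - H / log2(k), where H is the Shannon entropy of the class shares in bits and k is the number of classes. The sketch below assumes that definition (the report does not state it) and uses the class counts listed in the RatecodeID and payment_type sections; the function name `imbalance_score` is illustrative.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k) for a list of class counts (needs k >= 2)."""
    total = sum(counts)
    k = len(counts)
    # Shannon entropy of the class shares, in bits.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


ratecode_counts = [10653, 101, 3]      # RatecodeID values 1, 2, 4
payment_counts = [7407, 3275, 59, 16]  # payment_type values 1, 2, 3, 4

ratecode_imbalance = imbalance_score(ratecode_counts)
payment_imbalance = imbalance_score(payment_counts)

print(f"RatecodeID:   {ratecode_imbalance:.1%}")
print(f"payment_type: {payment_imbalance:.1%}")
```

A score of 0 corresponds to perfectly balanced classes, and a score near 1 means a single class dominates, as it does for RatecodeID.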

Reproduction

Analysis started: 2025-12-04 17:00:48.653365
Analysis finished: 2025-12-04 17:00:53.602831
Duration: 4.95 seconds
Software version: ydata-profiling v4.18.0
Download configuration: config.json
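
The reported duration follows directly from the two timestamps. A minimal check with the standard library:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2025-12-04 17:00:48.653365")
finished = datetime.fromisoformat("2025-12-04 17:00:53.602831")

elapsed_seconds = (finished - started).total_seconds()
print(f"{elapsed_seconds:.2f} seconds")
```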

Variables

tpep_pickup_datetime

Distinct: 10752
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Memory size: 426.1 KiB
Minimum: 2017-01-01 00:08:25
Maximum: 2017-12-31 23:45:30
Invalid dates: 0
Invalid dates (%): 0.0%
Histogram with fixed size bins (bins=50)

tpep_dropoff_datetime

Distinct: 10752
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Memory size: 426.1 KiB
Minimum: 2017-01-01 00:17:20
Maximum: 2017-12-31 23:49:24
Invalid dates: 0
Invalid dates (%): 0.0%
Histogram with fixed size bins (bins=50)

passenger_count
Real number (ℝ)

Distinct: 7
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.6362369
Minimum: 0
Maximum: 6
Zeros: 9
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 426.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 1
Median: 1
Q3: 2
95-th percentile: 5
Maximum: 6
Range: 6
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.2572436
Coefficient of variation (CV): 0.76837506
Kurtosis: 3.8242532
Mean: 1.6362369
Median Absolute Deviation (MAD): 0
Skewness: 2.1835228
Sum: 17601
Variance: 1.5806615
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=7)

Common values

Value | Count | Frequency (%)
1 | 7572 | 70.4%
2 | 1666 | 15.5%
5 | 542 | 5.0%
3 | 459 | 4.3%
6 | 287 | 2.7%
4 | 222 | 2.1%
0 | 9 | 0.1%

Minimum values

Value | Count | Frequency (%)
0 | 9 | 0.1%
1 | 7572 | 70.4%
2 | 1666 | 15.5%
3 | 459 | 4.3%
4 | 222 | 2.1%
5 | 542 | 5.0%
6 | 287 | 2.7%

Maximum values

Value | Count | Frequency (%)
6 | 287 | 2.7%
5 | 542 | 5.0%
4 | 222 | 2.1%
3 | 459 | 4.3%
2 | 1666 | 15.5%
1 | 7572 | 70.4%
0 | 9 | 0.1%
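
Because passenger_count takes only seven distinct values, its descriptive statistics can be recomputed from the frequency table alone. The sketch below does this in plain Python. Using the sample variance (n - 1 in the denominator) is an assumption; it is the convention that matches the reported variance.

```python
import math

# value -> count, from the passenger_count frequency table
freq = {0: 9, 1: 7572, 2: 1666, 3: 459, 4: 222, 5: 542, 6: 287}

n = sum(freq.values())                          # number of observations
total = sum(v * c for v, c in freq.items())     # the "Sum" statistic
mean = total / n

# Assumption: sample variance (divide by n - 1).
variance = sum(c * (v - mean) ** 2 for v, c in freq.items()) / (n - 1)
std = math.sqrt(variance)
cv = std / mean                                 # coefficient of variation

print(n, total)
print(f"mean={mean:.7f} variance={variance:.7f} std={std:.7f} cv={cv:.8f}")
```

The counts add up to the 10757 observations and the weighted sum to 17601, so the table and the summary statistics agree.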

trip_distance
Real number (ℝ)

High correlation 

Distinct: 1124
Distinct (%): 10.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.8419513
Minimum: 0
Maximum: 30.83
Zeros: 62
Zeros (%): 0.6%
Negative: 0
Negative (%): 0.0%
Memory size: 426.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.5
Q1: 1
Median: 1.74
Q3: 3.24
95-th percentile: 9.5
Maximum: 30.83
Range: 30.83
Interquartile range (IQR): 2.24

Descriptive statistics

Standard deviation: 3.1753071
Coefficient of variation (CV): 1.1172982
Kurtosis: 10.830603
Mean: 2.8419513
Median Absolute Deviation (MAD): 0.89
Skewness: 2.8922334
Sum: 30570.87
Variance: 10.082575
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
1 | 252 | 2.3%
1.1 | 235 | 2.2%
0.8 | 220 | 2.0%
0.9 | 215 | 2.0%
1.2 | 209 | 1.9%
0.7 | 203 | 1.9%
1.3 | 185 | 1.7%
1.4 | 183 | 1.7%
0.6 | 175 | 1.6%
1.5 | 172 | 1.6%
Other values (1114) | 8708 | 81.0%

Minimum values

Value | Count | Frequency (%)
0 | 62 | 0.6%
0.01 | 3 | < 0.1%
0.02 | 4 | < 0.1%
0.03 | 2 | < 0.1%
0.04 | 2 | < 0.1%
0.06 | 1 | < 0.1%
0.07 | 2 | < 0.1%
0.08 | 1 | < 0.1%
0.1 | 16 | 0.1%
0.11 | 1 | < 0.1%

Maximum values

Value | Count | Frequency (%)
30.83 | 1 | < 0.1%
27.97 | 1 | < 0.1%
27.88 | 1 | < 0.1%
27.34 | 1 | < 0.1%
26.54 | 1 | < 0.1%
25.86 | 1 | < 0.1%
25.8 | 1 | < 0.1%
24.89 | 1 | < 0.1%
24.61 | 1 | < 0.1%
24.1 | 1 | < 0.1%

RatecodeID
Categorical

High correlation  Imbalance 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 867.3 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1
Histogram of lengths of the category

Characters and Unicode

Total characters: 10757
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 10653 | 99.0%
2 | 101 | 0.9%
4 | 3 | < 0.1%

Most occurring characters

Value | Count | Frequency (%)
1 | 10653 | 99.0%
2 | 101 | 0.9%
4 | 3 | < 0.1%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 10757 | 100.0%

Most frequent character per category, script, and block

Each (unknown) group repeats the character frequencies listed under "Most occurring characters".

PULocationID
Real number (ℝ)

Distinct: 125
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 160.56912
Minimum: 4
Maximum: 265
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 426.1 KiB

Quantile statistics

Minimum: 4
5-th percentile: 48
Q1: 113
Median: 161
Q3: 231
95-th percentile: 249
Maximum: 265
Range: 261
Interquartile range (IQR): 118

Descriptive statistics

Standard deviation: 66.02353
Coefficient of variation (CV): 0.41118449
Kurtosis: -0.94669805
Mean: 160.56912
Median Absolute Deviation (MAD): 68
Skewness: -0.19541278
Sum: 1727242
Variance: 4359.1065
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
230 | 411 | 3.8%
48 | 411 | 3.8%
234 | 406 | 3.8%
79 | 402 | 3.7%
162 | 400 | 3.7%
161 | 395 | 3.7%
237 | 366 | 3.4%
186 | 349 | 3.2%
170 | 341 | 3.2%
163 | 318 | 3.0%
Other values (115) | 6958 | 64.7%

Minimum values

Value | Count | Frequency (%)
4 | 30 | 0.3%
7 | 17 | 0.2%
10 | 1 | < 0.1%
12 | 2 | < 0.1%
13 | 75 | 0.7%
14 | 2 | < 0.1%
17 | 5 | < 0.1%
24 | 22 | 0.2%
25 | 16 | 0.1%
28 | 1 | < 0.1%

Maximum values

Value | Count | Frequency (%)
265 | 2 | < 0.1%
264 | 182 | 1.7%
263 | 166 | 1.5%
262 | 64 | 0.6%
261 | 52 | 0.5%
260 | 8 | 0.1%
258 | 1 | < 0.1%
256 | 9 | 0.1%
255 | 23 | 0.2%
249 | 311 | 2.9%

DOLocationID
Real number (ℝ)

Distinct: 194
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 158.22692
Minimum: 4
Maximum: 265
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 426.1 KiB

Quantile statistics

Minimum: 4
5-th percentile: 41
Q1: 100
Median: 161
Q3: 233
95-th percentile: 261
Maximum: 265
Range: 261
Interquartile range (IQR): 133

Descriptive statistics

Standard deviation: 72.613237
Coefficient of variation (CV): 0.45891834
Kurtosis: -1.0866039
Mean: 158.22692
Median Absolute Deviation (MAD): 70
Skewness: -0.2512199
Sum: 1702047
Variance: 5272.6821
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
48 | 364 | 3.4%
170 | 321 | 3.0%
236 | 320 | 3.0%
230 | 312 | 2.9%
79 | 306 | 2.8%
186 | 292 | 2.7%
239 | 279 | 2.6%
142 | 278 | 2.6%
141 | 262 | 2.4%
107 | 251 | 2.3%
Other values (184) | 7772 | 72.3%

Minimum values

Value | Count | Frequency (%)
4 | 64 | 0.6%
7 | 62 | 0.6%
9 | 2 | < 0.1%
10 | 5 | < 0.1%
11 | 1 | < 0.1%
12 | 6 | 0.1%
13 | 86 | 0.8%
14 | 12 | 0.1%
15 | 3 | < 0.1%
16 | 2 | < 0.1%

Maximum values

Value | Count | Frequency (%)
265 | 12 | 0.1%
264 | 166 | 1.5%
263 | 213 | 2.0%
262 | 144 | 1.3%
261 | 41 | 0.4%
260 | 15 | 0.1%
259 | 3 | < 0.1%
257 | 16 | 0.1%
256 | 42 | 0.4%
255 | 66 | 0.6%

payment_type
Categorical

Imbalance 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 867.3 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1
Histogram of lengths of the category

Characters and Unicode

Total characters: 10757
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 7407 | 68.9%
2 | 3275 | 30.4%
3 | 59 | 0.5%
4 | 16 | 0.1%

Most occurring characters

Value | Count | Frequency (%)
1 | 7407 | 68.9%
2 | 3275 | 30.4%
3 | 59 | 0.5%
4 | 16 | 0.1%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 10757 | 100.0%

Most frequent character per category, script, and block

Each (unknown) group repeats the character frequencies listed under "Most occurring characters".

fare_amount
Real number (ℝ)

High correlation 

Distinct: 128
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12.296551
Minimum: 2.5
Maximum: 85.5
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 426.1 KiB

Quantile statistics

Minimum: 2.5
5-th percentile: 4.5
Q1: 6.5
Median: 9.5
Q3: 14.5
95-th percentile: 31
Maximum: 85.5
Range: 83
Interquartile range (IQR): 8

Descriptive statistics

Standard deviation: 9.172736
Coefficient of variation (CV): 0.74596006
Kurtosis: 7.1611999
Mean: 12.296551
Median Absolute Deviation (MAD): 3.5
Skewness: 2.3598243
Sum: 132274
Variance: 84.139085
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
6 | 555 | 5.2%
6.5 | 514 | 4.8%
5.5 | 508 | 4.7%
7 | 506 | 4.7%
7.5 | 487 | 4.5%
5 | 470 | 4.4%
8.5 | 456 | 4.2%
8 | 436 | 4.1%
9 | 435 | 4.0%
9.5 | 411 | 3.8%
Other values (118) | 5979 | 55.6%

Minimum values

Value | Count | Frequency (%)
2.5 | 67 | 0.6%
3 | 47 | 0.4%
3.5 | 147 | 1.4%
4 | 268 | 2.5%
4.5 | 357 | 3.3%
5 | 470 | 4.4%
5.5 | 508 | 4.7%
6 | 555 | 5.2%
6.5 | 514 | 4.8%
7 | 506 | 4.7%

Maximum values

Value | Count | Frequency (%)
85.5 | 1 | < 0.1%
80 | 1 | < 0.1%
78 | 1 | < 0.1%
76 | 1 | < 0.1%
73 | 1 | < 0.1%
72.5 | 1 | < 0.1%
70.5 | 1 | < 0.1%
67.5 | 1 | < 0.1%
66 | 2 | < 0.1%
64.5 | 1 | < 0.1%

extra
Categorical

High correlation 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 888.3 KiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3
Histogram of lengths of the category

Characters and Unicode

Total characters: 32271
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.5
2nd row: 0.5
3rd row: 1.0
4th row: 1.0
5th row: 1.0

Common Values

Value | Count | Frequency (%)
0.5 | 7095 | 66.0%
1.0 | 3561 | 33.1%
4.5 | 101 | 0.9%

Most occurring characters

Value | Count | Frequency (%)
. | 10757 | 33.3%
0 | 10656 | 33.0%
5 | 7196 | 22.3%
1 | 3561 | 11.0%
4 | 101 | 0.3%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 32271 | 100.0%

Most frequent character per category, script, and block

Each (unknown) group repeats the character frequencies listed under "Most occurring characters".
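
The character table for extra follows from treating each value as a three-character string and weighting every character by the value's count. The sketch below reproduces it from the three value counts listed under Common Values; it illustrates the arithmetic and is not the profiler's own code.

```python
from collections import Counter

# String value -> count, from the Common Values table for extra.
value_counts = {"0.5": 7095, "1.0": 3561, "4.5": 101}

char_counts = Counter()
for text, count in value_counts.items():
    for ch in text:
        char_counts[ch] += count  # each occurrence of the value contributes one of each character

total_chars = sum(char_counts.values())
for ch, count in char_counts.most_common():
    print(f"{ch!r}: {count} ({count / total_chars:.1%})")
```

The total of 32271 characters is simply 3 characters times 10757 rows, which is why a numeric column stored as text produces this kind of table.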

mta_tax
Categorical

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 888.3 KiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3
Histogram of lengths of the category

Characters and Unicode

Total characters: 32271
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.5
2nd row: 0.5
3rd row: 0.5
4th row: 0.5
5th row: 0.5

Common Values

Value | Count | Frequency (%)
0.5 | 10757 | 100.0%

Most occurring characters

Value | Count | Frequency (%)
0 | 10757 | 33.3%
. | 10757 | 33.3%
5 | 10757 | 33.3%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 32271 | 100.0%

Most frequent character per category, script, and block

Each (unknown) group repeats the character frequencies listed under "Most occurring characters".

tip_amount
Real number (ℝ)

High correlation  Zeros 

Distinct: 545
Distinct (%): 5.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.7989226
Minimum: 0
Maximum: 42.29
Zeros: 3654
Zeros (%): 34.0%
Negative: 0
Negative (%): 0.0%
Memory size: 426.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 1.46
Q3: 2.5
95-th percentile: 5.75
Maximum: 42.29
Range: 42.29
Interquartile range (IQR): 2.5

Descriptive statistics

Standard deviation: 2.1840416
Coefficient of variation (CV): 1.2140832
Kurtosis: 21.73657
Mean: 1.7989226
Median Absolute Deviation (MAD): 1.46
Skewness: 3.021124
Sum: 19351.01
Variance: 4.7700376
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
0 | 3654 | 34.0%
1 | 708 | 6.6%
2 | 379 | 3.5%
1.5 | 164 | 1.5%
1.66 | 115 | 1.1%
3 | 115 | 1.1%
1.96 | 105 | 1.0%
2.06 | 104 | 1.0%
1.46 | 102 | 0.9%
1.45 | 102 | 0.9%
Other values (535) | 5209 | 48.4%

Minimum values

Value | Count | Frequency (%)
0 | 3654 | 34.0%
0.01 | 6 | 0.1%
0.02 | 2 | < 0.1%
0.04 | 1 | < 0.1%
0.08 | 1 | < 0.1%
0.1 | 4 | < 0.1%
0.12 | 1 | < 0.1%
0.2 | 5 | < 0.1%
0.26 | 2 | < 0.1%
0.34 | 1 | < 0.1%

Maximum values

Value | Count | Frequency (%)
42.29 | 1 | < 0.1%
28 | 1 | < 0.1%
25 | 1 | < 0.1%
22.22 | 1 | < 0.1%
20 | 1 | < 0.1%
18.92 | 1 | < 0.1%
18.56 | 1 | < 0.1%
17.19 | 2 | < 0.1%
15.95 | 1 | < 0.1%
15.76 | 1 | < 0.1%

tolls_amount
Real number (ℝ)

Zeros 

Distinct: 16
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.20604537
Minimum: 0
Maximum: 17.28
Zeros: 10368
Zeros (%): 96.4%
Negative: 0
Negative (%): 0.0%
Memory size: 426.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 0
Maximum: 17.28
Range: 17.28
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.0835814
Coefficient of variation (CV): 5.2589457
Kurtosis: 32.996151
Mean: 0.20604537
Median Absolute Deviation (MAD): 0
Skewness: 5.4578182
Sum: 2216.43
Variance: 1.1741486
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=16)

Common values

Value | Count | Frequency (%)
0 | 10368 | 96.4%
5.76 | 273 | 2.5%
5.54 | 94 | 0.9%
2.64 | 6 | 0.1%
2.54 | 5 | < 0.1%
11.52 | 1 | < 0.1%
2.16 | 1 | < 0.1%
8.5 | 1 | < 0.1%
17.28 | 1 | < 0.1%
5.49 | 1 | < 0.1%
Other values (6) | 6 | 0.1%

Minimum values

Value | Count | Frequency (%)
0 | 10368 | 96.4%
2.16 | 1 | < 0.1%
2.54 | 5 | < 0.1%
2.64 | 6 | 0.1%
2.7 | 1 | < 0.1%
5.16 | 1 | < 0.1%
5.49 | 1 | < 0.1%
5.54 | 94 | 0.9%
5.76 | 273 | 2.5%
6.32 | 1 | < 0.1%

Maximum values

Value | Count | Frequency (%)
17.28 | 1 | < 0.1%
16.62 | 1 | < 0.1%
11.52 | 1 | < 0.1%
10.5 | 1 | < 0.1%
8.5 | 1 | < 0.1%
8.4 | 1 | < 0.1%
6.32 | 1 | < 0.1%
5.76 | 273 | 2.5%
5.54 | 94 | 0.9%
5.49 | 1 | < 0.1%

improvement_surcharge
Categorical

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 888.3 KiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3
Histogram of lengths of the category

Characters and Unicode

Total characters: 32271
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.3
2nd row: 0.3
3rd row: 0.3
4th row: 0.3
5th row: 0.3

Common Values

Value | Count | Frequency (%)
0.3 | 10757 | 100.0%

Most occurring characters

Value | Count | Frequency (%)
0 | 10757 | 33.3%
. | 10757 | 33.3%
3 | 10757 | 33.3%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 32271 | 100.0%

Most frequent character per category, script, and block

Each (unknown) group repeats the character frequencies listed under "Most occurring characters".

total_amount
Real number (ℝ)

High correlation 

Distinct: 915
Distinct (%): 8.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.806325
Minimum: 3.8
Maximum: 111.38
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 426.1 KiB

Quantile statistics

Minimum: 3.8
5-th percentile: 6.3
Q1: 8.8
Median: 12.35
Q3: 17.88
95-th percentile: 39.36
Maximum: 111.38
Range: 107.58
Interquartile range (IQR): 9.08

Descriptive statistics

Standard deviation: 11.304079
Coefficient of variation (CV): 0.71516174
Kurtosis: 8.6127884
Mean: 15.806325
Median Absolute Deviation (MAD): 4.05
Skewness: 2.5823137
Sum: 170028.64
Variance: 127.7822
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
7.3 | 257 | 2.4%
8.3 | 235 | 2.2%
7.8 | 235 | 2.2%
6.8 | 230 | 2.1%
8.8 | 229 | 2.1%
10.3 | 223 | 2.1%
10.8 | 210 | 2.0%
9.3 | 206 | 1.9%
9.8 | 199 | 1.8%
6.3 | 182 | 1.7%
Other values (905) | 8551 | 79.5%

Minimum values

Value | Count | Frequency (%)
3.8 | 38 | 0.4%
4.3 | 38 | 0.4%
4.56 | 1 | < 0.1%
4.75 | 1 | < 0.1%
4.8 | 61 | 0.6%
5 | 2 | < 0.1%
5.15 | 2 | < 0.1%
5.16 | 2 | < 0.1%
5.28 | 2 | < 0.1%
5.3 | 99 | 0.9%

Maximum values

Value | Count | Frequency (%)
111.38 | 1 | < 0.1%
99.59 | 1 | < 0.1%
92.84 | 1 | < 0.1%
91.9 | 1 | < 0.1%
89.44 | 1 | < 0.1%
89.16 | 1 | < 0.1%
88.56 | 1 | < 0.1%
86.76 | 1 | < 0.1%
85.28 | 1 | < 0.1%
85.06 | 1 | < 0.1%

trip_duration
Unsupported

Rejected  Unsupported 

Missing: 0
Missing (%): 0.0%
Memory size: 426.1 KiB

duration_minutes
Real number (ℝ)

High correlation  Skewed 

Distinct: 2270
Distinct (%): 21.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 16.767031
Minimum: -16.983333
Maximum: 1439.55
Zeros: 16
Zeros (%): 0.1%
Negative: 1
Negative (%): < 0.1%
Memory size: 426.1 KiB

Quantile statistics

Minimum: -16.983333
5-th percentile: 2.95
Q1: 6.5666667
Median: 10.916667
Q3: 17.516667
95-th percentile: 33.486667
Maximum: 1439.55
Range: 1456.5333
Interquartile range (IQR): 10.95

Descriptive statistics

Standard deviation: 67.381783
Coefficient of variation (CV): 4.018707
Kurtosis: 420.26994
Mean: 16.767031
Median Absolute Deviation (MAD): 5.0333333
Skewness: 20.27968
Sum: 180362.95
Variance: 4540.3047
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
5.383333333 | 20 | 0.2%
7.05 | 18 | 0.2%
8.566666667 | 18 | 0.2%
6.1 | 18 | 0.2%
9.383333333 | 18 | 0.2%
3.783333333 | 18 | 0.2%
7.566666667 | 18 | 0.2%
5.1 | 17 | 0.2%
5.316666667 | 17 | 0.2%
8.583333333 | 17 | 0.2%
Other values (2260) | 10578 | 98.3%

Minimum values

Value | Count | Frequency (%)
-16.98333333 | 1 | < 0.1%
0 | 16 | 0.1%
0.03333333333 | 5 | < 0.1%
0.05 | 7 | 0.1%
0.06666666667 | 1 | < 0.1%
0.08333333333 | 4 | < 0.1%
0.1 | 5 | < 0.1%
0.1333333333 | 4 | < 0.1%
0.15 | 3 | < 0.1%
0.1666666667 | 2 | < 0.1%

Maximum values

Value | Count | Frequency (%)
1439.55 | 1 | < 0.1%
1439.15 | 1 | < 0.1%
1438.65 | 1 | < 0.1%
1438.55 | 1 | < 0.1%
1438.466667 | 1 | < 0.1%
1438.266667 | 1 | < 0.1%
1437.833333 | 1 | < 0.1%
1436.5 | 1 | < 0.1%
1435.8 | 1 | < 0.1%
1433.983333 | 1 | < 0.1%

Interactions

[81 pairwise interaction plots; figures not reproduced]

Correlations

[Correlation heatmap; figure not reproduced]

Variable | DOLocationID | PULocationID | RatecodeID | duration_minutes | extra | fare_amount | passenger_count | payment_type | tip_amount | tolls_amount | total_amount | trip_distance
DOLocationID | 1.000 | 0.110 | 0.115 | -0.066 | 0.132 | -0.073 | 0.010 | 0.036 | -0.006 | -0.007 | -0.065 | -0.076
PULocationID | 0.110 | 1.000 | 0.138 | -0.062 | 0.153 | -0.073 | -0.006 | 0.020 | -0.014 | -0.052 | -0.065 | -0.074
RatecodeID | 0.115 | 0.138 | 1.000 | 0.000 | 0.707 | 0.757 | 0.000 | 0.000 | 0.259 | 0.278 | 0.599 | 0.619
duration_minutes | -0.066 | -0.062 | 0.000 | 1.000 | 0.009 | 0.960 | 0.022 | 0.000 | 0.381 | 0.259 | 0.940 | 0.843
extra | 0.132 | 0.153 | 0.707 | 0.009 | 1.000 | 0.587 | 0.016 | 0.000 | 0.259 | 0.276 | 0.538 | 0.505
fare_amount | -0.073 | -0.073 | 0.757 | 0.960 | 0.587 | 1.000 | 0.022 | 0.039 | 0.400 | 0.299 | 0.978 | 0.935
passenger_count | 0.010 | -0.006 | 0.000 | 0.022 | 0.016 | 0.022 | 1.000 | 0.026 | -0.023 | 0.016 | 0.015 | 0.030
payment_type | 0.036 | 0.020 | 0.000 | 0.000 | 0.000 | 0.039 | 0.026 | 1.000 | 0.122 | 0.012 | 0.089 | 0.026
tip_amount | -0.006 | -0.014 | 0.259 | 0.381 | 0.259 | 0.400 | -0.023 | 0.122 | 1.000 | 0.178 | 0.542 | 0.385
tolls_amount | -0.007 | -0.052 | 0.278 | 0.259 | 0.276 | 0.299 | 0.016 | 0.012 | 0.178 | 1.000 | 0.311 | 0.295
total_amount | -0.065 | -0.065 | 0.599 | 0.940 | 0.538 | 0.978 | 0.015 | 0.089 | 0.542 | 0.311 | 1.000 | 0.913
trip_distance | -0.076 | -0.074 | 0.619 | 0.843 | 0.505 | 0.935 | 0.030 | 0.026 | 0.385 | 0.295 | 0.913 | 1.000
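
The High correlation alerts in the overview can be reproduced from this matrix by flagging every off-diagonal coefficient at or above a cutoff. The sketch below assumes a cutoff of 0.5 on the absolute value; the report does not state the threshold, but 0.5 yields exactly the partner counts given in the alerts.

```python
names = [
    "DOLocationID", "PULocationID", "RatecodeID", "duration_minutes",
    "extra", "fare_amount", "passenger_count", "payment_type",
    "tip_amount", "tolls_amount", "total_amount", "trip_distance",
]

# Rows and columns follow the order of `names`.
matrix = [
    [1.000, 0.110, 0.115, -0.066, 0.132, -0.073, 0.010, 0.036, -0.006, -0.007, -0.065, -0.076],
    [0.110, 1.000, 0.138, -0.062, 0.153, -0.073, -0.006, 0.020, -0.014, -0.052, -0.065, -0.074],
    [0.115, 0.138, 1.000, 0.000, 0.707, 0.757, 0.000, 0.000, 0.259, 0.278, 0.599, 0.619],
    [-0.066, -0.062, 0.000, 1.000, 0.009, 0.960, 0.022, 0.000, 0.381, 0.259, 0.940, 0.843],
    [0.132, 0.153, 0.707, 0.009, 1.000, 0.587, 0.016, 0.000, 0.259, 0.276, 0.538, 0.505],
    [-0.073, -0.073, 0.757, 0.960, 0.587, 1.000, 0.022, 0.039, 0.400, 0.299, 0.978, 0.935],
    [0.010, -0.006, 0.000, 0.022, 0.016, 0.022, 1.000, 0.026, -0.023, 0.016, 0.015, 0.030],
    [0.036, 0.020, 0.000, 0.000, 0.000, 0.039, 0.026, 1.000, 0.122, 0.012, 0.089, 0.026],
    [-0.006, -0.014, 0.259, 0.381, 0.259, 0.400, -0.023, 0.122, 1.000, 0.178, 0.542, 0.385],
    [-0.007, -0.052, 0.278, 0.259, 0.276, 0.299, 0.016, 0.012, 0.178, 1.000, 0.311, 0.295],
    [-0.065, -0.065, 0.599, 0.940, 0.538, 0.978, 0.015, 0.089, 0.542, 0.311, 1.000, 0.913],
    [-0.076, -0.074, 0.619, 0.843, 0.505, 0.935, 0.030, 0.026, 0.385, 0.295, 0.913, 1.000],
]

THRESHOLD = 0.5  # assumption: inferred from the alerts, not stated in the report

# For each variable, list the other variables whose |coefficient| meets the cutoff.
high = {
    row: [col for col, r in zip(names, values) if col != row and abs(r) >= THRESHOLD]
    for row, values in zip(names, matrix)
}

for name, partners in high.items():
    if partners:
        print(f"{name}: {', '.join(partners)}")
```

Under this assumption seven variables are flagged, with tip_amount paired only with total_amount and total_amount paired with six other fields, matching the alert list.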

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

index | tpep_pickup_datetime | tpep_dropoff_datetime | passenger_count | trip_distance | RatecodeID | PULocationID | DOLocationID | payment_type | fare_amount | extra | mta_tax | tip_amount | tolls_amount | improvement_surcharge | total_amount | trip_duration | duration_minutes
4 | 2017-04-15 23:32:20 | 2017-04-15 23:49:03 | 1 | 4.37 | 1 | 4 | 112 | 2 | 16.5 | 0.5 | 0.5 | 0.00 | 0.0 | 0.3 | 17.80 | 0 days 00:16:43 | 16.716667
5 | 2017-03-25 20:34:11 | 2017-03-25 20:42:11 | 6 | 2.30 | 1 | 161 | 236 | 1 | 9.0 | 0.5 | 0.5 | 2.06 | 0.0 | 0.3 | 12.36 | 0 days 00:08:00 | 8.000000
6 | 2017-05-03 19:04:09 | 2017-05-03 20:03:47 | 1 | 12.83 | 1 | 79 | 241 | 1 | 47.5 | 1.0 | 0.5 | 9.86 | 0.0 | 0.3 | 59.16 | 0 days 00:59:38 | 59.633333
7 | 2017-08-15 17:41:06 | 2017-08-15 18:03:05 | 1 | 2.98 | 1 | 237 | 114 | 1 | 16.0 | 1.0 | 0.5 | 1.78 | 0.0 | 0.3 | 19.58 | 0 days 00:21:59 | 21.983333
12 | 2017-06-09 19:00:26 | 2017-06-09 19:20:11 | 1 | 3.00 | 1 | 13 | 148 | 1 | 15.0 | 1.0 | 0.5 | 3.35 | 0.0 | 0.3 | 20.15 | 0 days 00:19:45 | 19.750000
13 | 2017-11-06 23:35:05 | 2017-11-06 23:42:57 | 1 | 2.39 | 1 | 209 | 25 | 1 | 9.5 | 0.5 | 0.5 | 2.16 | 0.0 | 0.3 | 12.96 | 0 days 00:07:52 | 7.866667
16 | 2017-08-15 19:48:08 | 2017-08-15 20:00:37 | 1 | 3.60 | 1 | 163 | 41 | 1 | 12.5 | 1.0 | 0.5 | 2.85 | 0.0 | 0.3 | 17.15 | 0 days 00:12:29 | 12.483333
18 | 2017-04-10 18:12:58 | 2017-04-10 18:17:39 | 2 | 0.63 | 1 | 263 | 262 | 2 | 5.0 | 1.0 | 0.5 | 0.00 | 0.0 | 0.3 | 6.80 | 0 days 00:04:41 | 4.683333
19 | 2017-03-05 04:01:07 | 2017-03-05 04:14:11 | 2 | 2.77 | 1 | 79 | 68 | 1 | 11.5 | 0.5 | 0.5 | 3.20 | 0.0 | 0.3 | 16.00 | 0 days 00:13:04 | 13.066667
20 | 2017-12-30 23:52:44 | 2017-12-30 23:58:57 | 1 | 1.10 | 1 | 166 | 238 | 2 | 6.5 | 0.5 | 0.5 | 0.00 | 0.0 | 0.3 | 7.80 | 0 days 00:06:13 | 6.216667

Last rows

index | tpep_pickup_datetime | tpep_dropoff_datetime | passenger_count | trip_distance | RatecodeID | PULocationID | DOLocationID | payment_type | fare_amount | extra | mta_tax | tip_amount | tolls_amount | improvement_surcharge | total_amount | trip_duration | duration_minutes
22681 | 2017-06-09 18:24:49 | 2017-06-09 18:36:15 | 1 | 1.79 | 1 | 234 | 144 | 1 | 9.5 | 1.0 | 0.5 | 1.00 | 0.00 | 0.3 | 12.30 | 0 days 00:11:26 | 11.433333
22683 | 2017-08-03 17:30:04 | 2017-08-03 17:41:52 | 1 | 1.17 | 1 | 107 | 170 | 1 | 8.5 | 1.0 | 0.5 | 2.06 | 0.00 | 0.3 | 12.36 | 0 days 00:11:48 | 11.800000
22684 | 2017-08-03 16:36:32 | 2017-08-03 16:46:23 | 2 | 1.20 | 1 | 68 | 50 | 1 | 8.0 | 1.0 | 0.5 | 1.95 | 0.00 | 0.3 | 11.75 | 0 days 00:09:51 | 9.850000
22685 | 2017-07-05 22:42:46 | 2017-07-05 22:49:29 | 1 | 1.01 | 1 | 144 | 79 | 1 | 6.5 | 0.5 | 0.5 | 1.56 | 0.00 | 0.3 | 9.36 | 0 days 00:06:43 | 6.716667
22686 | 2017-02-08 18:13:26 | 2017-02-08 19:34:11 | 5 | 10.64 | 1 | 170 | 70 | 1 | 52.0 | 1.0 | 0.5 | 14.84 | 5.54 | 0.3 | 74.18 | 0 days 01:20:45 | 80.750000
22688 | 2017-08-05 21:23:29 | 2017-08-05 21:26:11 | 3 | 0.44 | 1 | 230 | 163 | 2 | 4.0 | 0.5 | 0.5 | 0.00 | 0.00 | 0.3 | 5.30 | 0 days 00:02:42 | 2.700000
22691 | 2017-01-06 01:50:14 | 2017-01-06 01:56:47 | 1 | 2.12 | 1 | 170 | 79 | 1 | 8.0 | 0.5 | 0.5 | 0.00 | 0.00 | 0.3 | 9.30 | 0 days 00:06:33 | 6.550000
22692 | 2017-07-16 03:22:51 | 2017-07-16 03:40:52 | 1 | 5.70 | 1 | 249 | 17 | 1 | 19.0 | 0.5 | 0.5 | 4.05 | 0.00 | 0.3 | 24.35 | 0 days 00:18:01 | 18.016667
22693 | 2017-08-10 22:20:04 | 2017-08-10 22:29:31 | 1 | 0.89 | 1 | 229 | 170 | 1 | 7.5 | 0.5 | 0.5 | 1.76 | 0.00 | 0.3 | 10.56 | 0 days 00:09:27 | 9.450000
22694 | 2017-02-24 17:37:23 | 2017-02-24 17:40:39 | 3 | 0.61 | 1 | 48 | 186 | 2 | 4.0 | 1.0 | 0.5 | 0.00 | 0.00 | 0.3 | 5.80 | 0 days 00:03:16 | 3.266667
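
Two relationships are visible in the sample rows: total_amount equals the sum of the six charge columns, and duration_minutes is trip_duration expressed in minutes. The sketch below checks both on the first three sample rows (indexes 4, 5 and 6), with values as read from the sample. It makes no claim that the first relationship holds for every row in the dataset.

```python
import math
from datetime import timedelta

# (fare_amount, extra, mta_tax, tip_amount, tolls_amount,
#  improvement_surcharge, total_amount, trip_duration)
rows = [
    (16.5, 0.5, 0.5, 0.00, 0.0, 0.3, 17.80, timedelta(minutes=16, seconds=43)),
    (9.0, 0.5, 0.5, 2.06, 0.0, 0.3, 12.36, timedelta(minutes=8)),
    (47.5, 1.0, 0.5, 9.86, 0.0, 0.3, 59.16, timedelta(minutes=59, seconds=38)),
]

checks = []
for fare, extra, mta, tip, tolls, surcharge, total, duration in rows:
    components = fare + extra + mta + tip + tolls + surcharge
    minutes = duration.total_seconds() / 60
    # Compare with a half-cent tolerance to absorb floating-point noise.
    totals_match = math.isclose(components, total, abs_tol=0.005)
    checks.append((totals_match, round(minutes, 6)))

for totals_match, minutes in checks:
    print(totals_match, minutes)
```

The computed minutes agree with the duration_minutes column for these rows, which suggests trip_duration can be dropped in favour of the numeric column when the unsupported type gets in the way.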